Robust Elastic Net Regression

Authors

  • Weiyang Liu
  • Zhiding Yu
  • Meng Yang
Abstract

We propose a robust elastic net (REN) model for high-dimensional sparse regression and give its performance guarantees (both the statistical error bound and the optimization bound). A simple idea of trimming the inner product is applied to the elastic net model. Specifically, we robustify the covariance matrix by trimming the inner product, based on the intuition that the trimmed inner product cannot be significantly affected by a bounded number of arbitrarily corrupted points (outliers). Two interesting special cases, robust Lasso and robust soft thresholding, can also be derived from the REN model. Comprehensive experimental results show that the proposed model consistently outperforms the original elastic net in robustness and matches the performance guarantees nicely.

Introduction

Over the past decades, sparse linear regression has been, and still is, one of the most powerful tools in statistics and machine learning. It seeks to represent a response variable as a sparse linear combination of covariates. Recently, sparse regression has received much attention and found interesting applications in cases where the number of variables p is far greater than the number of observations n. The regressor in the high-dimensional regime tends to be sparse or nearly sparse, which guarantees that the high-dimensional signal can be efficiently recovered despite the underdetermined nature of the problem [3, 7]. However, data corruption is very common in high-dimensional big data, and research has demonstrated that current sparse linear regression methods (e.g., Lasso) perform poorly when handling dirty data [6]. Therefore, how to robustify sparse linear regression has become a major concern that attracts increasing attention.

Robust sparse linear regression can be roughly categorized into several lines of research. One is to first remove the detected outliers and then perform the regression. However, outlier removal is not suitable for the high-dimensional regime, because outliers might not exhibit any strangeness in the ambient space due to the high-dimensional noise [22]. Another type of approach replaces the standard mean squared loss with a more robust loss function, such as the trimmed loss or the median squared loss; such approaches usually cannot give performance guarantees. Methods [12, 15, 13] have been developed to handle arbitrary corruption in the response variable, but they fail with corrupted covariates. [14] proposes a robust Lasso that considers stochastic noise or small bounded noise in the covariates. [5] also considers similar corruption settings and proposes a robust OMP algorithm for Lasso. For the same noise, [17] proposes the matrix uncertainty selector that serves as a robust estimator.
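To make the trimming idea concrete, the following is a minimal Python sketch of a trimmed inner product and of the robustified (trimmed) Gram statistics it produces; it is an illustration under simple assumptions, not the authors' implementation. The corruption budget h (the number of samples allowed to be outliers) and the helper names trimmed_inner_product and trimmed_gram are hypothetical; the sketch simply discards the h largest-magnitude elementwise products before summing, so a bounded number of arbitrarily corrupted samples cannot dominate the statistic.

import numpy as np

def trimmed_inner_product(u, v, h):
    # Inner product of u and v with the h largest-magnitude elementwise
    # products discarded, so at most h corrupted samples cannot
    # arbitrarily inflate the result (illustrative sketch).
    prods = u * v
    if h <= 0:
        return prods.sum()
    keep = np.argsort(np.abs(prods))[: max(len(prods) - h, 0)]
    return prods[keep].sum()

def trimmed_gram(X, y, h):
    # Robustified surrogates for X^T X and X^T y, built entry by entry
    # from trimmed inner products (hypothetical helper).
    n, p = X.shape
    G = np.empty((p, p))
    g = np.empty(p)
    for j in range(p):
        g[j] = trimmed_inner_product(X[:, j], y, h)
        for k in range(p):
            G[j, k] = trimmed_inner_product(X[:, j], X[:, k], h)
    return G, g

In a robustified model of this kind, such trimmed surrogates would replace the usual Gram matrix and correlation vector inside the elastic net's quadratic objective; the exact trimming rule and its parameters in REN are specified in the paper, not by this sketch.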

Related articles

Bayesian Quantile Regression with Adaptive Elastic Net Penalty for Longitudinal Data

Longitudinal studies form an important part of epidemiological surveys, clinical trials, and social studies. In longitudinal studies, the responses are measured repeatedly over time. Often, the main goal is to characterize the change in responses over time and the factors that influence that change. Recently, to analyze this kind of data, quantile regression has been taken ...

Robust Sparse PCA via Weighted Elastic Net

In principal component analysis (PCA), the ℓ2/ℓ1-norm is widely used to measure the coding residual. In this case, it is assumed that the residual follows a Gaussian/Laplacian distribution. However, this may fail to describe the coding errors in practice when there are outliers. Toward this end, this paper proposes a Robust Sparse PCA (RSPCA) approach to solve the outlier problem, by modeling the sparse coding ...

Robustness of Sparse Regression Models in fMRI Data Analysis

The primary goal of fMRI analysis is the identification of brain locations, or image voxels, associated with cognitive tasks of interest. Predictive modeling techniques have become widely used in such analysis and sparse modeling techniques have become especially attractive due to their ability to identify relevant locations in a multivariate manner. However, many sparse methods may be too cons...

Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net

Regression shrinkage and variable selection are important concepts in high-dimensional statistics that allow the inference of robust models from large data sets. Bayesian methods achieve this by subjecting the model parameters to a prior distribution whose mass is centred around zero. In particular, the lasso and elastic net linear regression models employ a double-exponential distribution in t...

Robust Detection of Impaired Resting State Functional Connectivity Networks in Alzheimer's Disease Using Elastic Net Regularized Regression

The large number of multicollinear regional features that are provided by resting state (rs) fMRI data requires robust feature selection to uncover consistent networks of functional disconnection in Alzheimer's disease (AD). Here, we compared elastic net regularized and classical stepwise logistic regression with respect to consistency of feature selection and diagnostic accuracy using rs-fMRI da...


Journal:
  • CoRR

Volume: abs/1511.04690

Pages: -

Publication date: 2015